# User Survey

## How did the difficulty of the task compare depending on the structured information present in bug reports?

### U1

It was easier to process bug reports when source code was contained, especially when corrections or additions to existing classes were given in the report. This makes it highly plausible that the bug is at least partially located in the corrected code segment. The user who made the report obviously looked into the code and had a better understanding of it than I did.

### U2

It depended on how detailed the bug description was (whether it directly mentioned where to make modifications). Otherwise, it was easier to process bug reports containing source code snippets.

### U3

It seemed easier to work without code snippets. But in my opinion it depends on the type of bug. Some reports were more specific than others (e.g. some stated the intended behaviour, others didn't).

### U4

It is easier to find them with code snippets, but without any real experience it's hard, of course.

### U5

... I cannot say. I would have expected yes, but afterwards, I am not sure ...

### U6

It was easier because you could search directly for a keyword, method name, or attribute name. With free text, you had to understand what was meant. As an outsider, this makes it very difficult for me to judge where a bug might be.

### U7

easier

### U8

With code snippets it felt easier to actually understand what the non-structured text was talking about because I had some kind of example.
On the other hand, it subjectively did not help much for the actual localization.
For me, that depended more on how precisely the author had already tracked down the issue and explained why he thought that this specifically was the cause.

### U9

In both cases, localisation seemed very hard without having worked on any of those projects. With the reports containing no source code, it felt like I had very little information to work with, so they felt even harder.

### U10

It was way easier to process bug reports containing actual source code snippets indicating where the bug might be and some even providing snippets for the fix. 1-4 was much more difficult.

### U11

It was easier to localize the errors with code snippets in the tickets, also because most of the time, the class / file containing the error is also named.

### U12

... Challenging ...

### U13

Easier

### U14

It was easier to locate code snippets than comments only, because you can ignore single keywords and focus on marked code blocks.

### U15

Source code snippets in the bug reports definitely help finding the position in the code. Especially for someone who did not write the (buggy) code it helps to understand the problem.

### U16

easier

### U17

way easier with source code snippets

### U18

It was easier to process bug reports containing source code snippets.

## What is your opinion about semantic code highlighting?

### U1

The code coloring was useful when marking variable names or custom types. Markings of simple types were more distracting than helpful.

### U2

It helped in cases where parts of the code snippets (i.e., method names) were colored.

### U3

A bit, but in some cases I felt like it distracted me because it caused me to focus on the wrong parts of the code.

### U4

It was very helpful to search for specific code segments.

### U5

... yes I think, but not sure if I identified the right ones. It was really difficult for me ...

### U6

It depends. Unfortunately, keywords such as "new", "case", "String" or meaningless variable names such as "name" were highlighted too often, and these were only confusing. However, when meaningful parameter names such as "exclude" were highlighted, it was very helpful.

### U7

I liked it

### U8

I mostly used ctrl+f to search.

### U9

It felt very helpful indeed. Still, in a bug localisation tool I would wish for a feature to hide highlighting that I know to be a false positive (like "new", the natural-language "method", etc.).

### U10

I liked the highlighting.

### U11

For the first 4 tickets it was moderately helpful. The natural language highlighting might benefit from better documented code. The code highlighting was a bit eager if a keyword was common, which diminished its helpfulness. For the last 4 tickets the coloring only saved a few seconds, as the code was already in the ticket, but just scrolling through the file and looking for the big highlighted block was slightly helpful.

### U12

... up to a certain extent ...

### U13

Often the code coloring points to relevant locations within the source files. On the other hand, seemingly irrelevant keywords are also highlighted, e.g., null/param/class/if/new/case.

### U14

A single color would be enough.

### U15

In these scenarios the coloring was not very helpful as often all keywords (if, String, ...) were highlighted. This was only confusing and I did not look at the colors at all, but rather used the Find function in the browser to look for keywords from the bug report ("firstIndex", "scriptState", ...). Also, I did not understand the purple coloring.

### U16

Coloring was not helpful

### U17

a bit, for getting a first impression where multiple matches are in close proximity to each other. mostly for longer tokens though; the shorter, more common ones are a bit distracting.

### U18

The code coloring simplifies the bug localization task.

## Any suggestions to further simplify the bug localization task?

### U1

I missed documentation of the given code to understand what the classes are for. Knowing what the code is doing makes it easier to localize the bug based on the information given in the natural language text. Understanding the code by reading it is hard, especially when the architecture of the full project is unknown.

### U2

I don't have any suggestions

### U3

I missed lots of project context. I think people who actually know the intended behaviours usually have a much easier time. I think I made lots of errors because I lack this expertise and don't work with Java or such big projects regularly.

### U4

The description of a bug should contain a bit more information to really find a bug, but I guess people who actually work with the code have no problem if the description looks like the ones here.

### U5

(no answer given)

### U6

I would have liked to know with what probability/confidence each of the files was selected. That would save me some of the time spent examining each file more closely. One could also be linked directly to the methods or line(s) where the bug is suspected to be (with high confidence/probability).

### U7

more precise code detection
clickable links in the description/summary where the snippets were found

### U8

I missed the navigation support of a real IDE (e.g. jump to declaration).
It would also be nice to be able to exclude some words from the highlighting tool.

### U9

I would like to know why a certain file was ranked high on the list so that I can verify the reason. I often started guessing or looking for typical indicators of a false positive.

### U10

Doing a keyword search on the source files myself with the browser's page search functionality helped with going through the file and the occurrences of the keyword. So a way to not only scroll and skim through the highlighted source files to detect relevant parts, but also to search for keywords by oneself, might be a helpful addition.

### U11

I think most bugs are located in 1-2 files at maximum, so it may be possible to only suggest up to 3 files (even fewer when the algorithm has some kind of confidence threshold). The next logical step besides suggesting a file is of course the suggestion of specific code sections. As this is probably very difficult to achieve confidently, a simple feature which directly shows some "keyword dense areas" might be possible and possibly helpful. This way you don't have to scroll through multiple hundred lines of code before finding the interesting (highlighted) parts. Given the way the tickets were phrased and chosen, I don't think this tool in its current state is too helpful if you are very familiar with the code base. However, I can see this being helpful for big projects where the maintainer of some module is unavailable while bugs are being filed for said module.

On a side note, for the study itself, some kind of "Back"-Button that brings you back to the ticket selection might improve the workflow within the study.

### U12

... NA ...

### U13

It seems the algorithm prefers long source files where the absolute count of matching words is comparatively high. Especially if irrelevant keywords are matched, this results in the prediction of long source files. Potential improvements: The density of matches could be taken into account instead of the absolute count. A "prediction score" could be added to the list of recommended files in order to avoid parsing irrelevant files.
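
The difference between an absolute match count and a match density can be illustrated with a minimal sketch. It is not the ranking used by the tool in the study; the tokenizer, the function names `tokenize` and `match_scores`, and the sample file contents are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase identifier-like tokens (assumed tokenizer)."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower())


def match_scores(report_terms, source_text):
    """Return (absolute match count, match density) for one source file.

    Density is the share of tokens in the file that match a report term,
    so a long file no longer wins just by containing more tokens.
    """
    terms = {t.lower() for t in report_terms}
    tokens = tokenize(source_text)
    count = sum(1 for tok in tokens if tok in terms)
    density = count / len(tokens) if tokens else 0.0
    return count, density


# Hypothetical inputs: a long file full of a common keyword and a short,
# focused file that uses the identifiers named in the bug report.
terms = ["firstIndex", "scriptState", "int"]
long_file = "int value = 0 ; " * 50 + "firstIndex = value ;"
short_file = "firstIndex = scriptState . firstIndex ;"

long_count, long_density = match_scores(terms, long_file)
short_count, short_density = match_scores(terms, short_file)
```

With these inputs the long file has the higher absolute count, while the short file has the higher density, which is the effect the response describes.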

### U14

Ignore single coding keywords like 'String', 'return', 'int', 'if' (see report 7 file 2).
Only highlight them if they are part of a block or followed by other keywords in the bug report.
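
A rough sketch of this filtering rule follows. It assumes the source line is already split into tokens and treats "followed by other keywords" as "the next token also matches the bug report"; the keyword list `COMMON_KEYWORDS` and the function `tokens_to_highlight` are hypothetical and not part of the tool in the study.

```python
# Assumed list of common language keywords and simple types.
COMMON_KEYWORDS = {"string", "return", "int", "if", "new", "case", "null", "class"}


def tokens_to_highlight(line_tokens, report_terms):
    """Return the indices of tokens that should be highlighted.

    A token that matches the bug report is always highlighted unless it is
    a common keyword; a common keyword is highlighted only when the token
    directly after it also matches the bug report.
    """
    terms = {t.lower() for t in report_terms}
    matched = [tok.lower() in terms for tok in line_tokens]
    result = []
    for i, tok in enumerate(line_tokens):
        if not matched[i]:
            continue
        if tok.lower() in COMMON_KEYWORDS:
            followed = i + 1 < len(line_tokens) and matched[i + 1]
            if not followed:
                continue
        result.append(i)
    return result


# Hypothetical example line and bug report terms.
line = ["String", "firstIndex", "=", "new", "Foo"]
highlighted = tokens_to_highlight(line, ["String", "firstIndex", "new"])
```

Here "String" stays highlighted because it is directly followed by the matching identifier "firstIndex", while the isolated "new" is suppressed.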

### U15

Maybe the code editor could tell me why the localization algorithm decided to put this file in the top 5, maybe by highlighting the respective section/snippet. Additionally, I frequently use the Ctrl-Click function of IDEs to navigate through code, but that's editor stuff ;)

### U16

People writing better bug reports.

### U17

It seems some files were ranked high simply because they are larger than others and therefore lead to a high number of matches somewhere in the file, which are not necessarily linked.

### U18

Right now no further suggestions.